Performance of Orthopaedic Shoulder and Elbow Surgeons on a Biostatistical Knowledge Examination

Background The objective of this study is to evaluate the biostatistical interpretation abilities of fellowship trained orthopaedic surgeons. Methods A cross-sectional survey was administered to orthopaedic surgeon members of the American Shoulder and Elbow Surgeons (ASES), assessing orthopaedic surgeon attitudes towards biostatistics, confidence in understanding biostatistics, and ability to interpret biostatistical measures on a multiple-choice test. Results A 4.5% response rate was achieved with 55 complete survey responses. The mean percent correct was 55.2%. Higher knowledge test scores were associated with younger age and fewer years since board exam completion (p ≤ 0.001). Greater average number of publications per year correlated with superior statistical interpretation (p=0.009). Respondents with higher self-reported confidence were more likely to accurately interpret results (p ≤ 0.017). Of the respondents, 93% reported frequently using statistics to form medical opinions, 98% answered that statistical competency is important in the practice of orthopaedic surgery, and 80% were eager to continue learning biostatistics. Conclusions It is concerning that fellowship-trained shoulder and elbow surgeons, many of whom frequently publish or are reviewing scientific literature for publication, are scoring 55.2% correctly on average on this biostatistical knowledge examination. Surgeons that are further from formal statistical knowledge training are more likely to have lower biostatistical knowledge test scores. Respondents who published at the highest rate were associated with higher scores. Continuing medical education in biostatistics may be beneficial for maintaining statistical knowledge utilised in the current literature.


Introduction
Evidence-based medicine (EBM) relies on physicians to have a deep understanding of the literature. Although some clinical practice guidelines relay bottom-line summaries of relevant research [1,2], many clinical questions must be answered by accessing original research [3]. This process calls for physicians to critically evaluate research quality, including study design, conduct, and analysis. More importantly, and perhaps most challenging, physicians must determine how the research applies to their own practice.
Although they concern populations distinct from orthopaedic surgeons and our study population, reports have shown that practicing physicians, especially those who lack formal training in biostatistics and epidemiology, had an overall poor understanding of routine statistical terms and a limited ability to interpret study results [4][5][6]. The majority of medical schools have since incorporated basic biostatistics courses into their curriculum [7]; however, over that same time, an increased focus on academia has led to a surge in publications [8,9]. As a result, authors have integrated complex statistical methods in an effort to set themselves apart [10]. These issues increase the difficulty for reviewers to appraise research methodology in studies, and it is plausible that researchers intentionally complicate their methodology to push past reviewers. In a letter to the editor, Horton and Switzer [10] reported that statistical methods used in published works between 2004 and 2005 increased in complexity. Specifically, of the methods they observed, only 21% would be expected to be covered in an introductory statistics course.
As part of the American Board of Orthopaedic Surgery (ABOS) Part I accrediting examination, orthopaedic surgeons may find that 0.5-1.5% of questions reference biostatistics [11]. The ABOS exam tests the interpretation of epidemiologic information, associations, health impact, study design and interpretation, types of observational studies, sampling and sample size, subject selection, exposure allocation, hypothesis testing, and statistical inference [11]. Additionally, recertification examinations place an important focus on current literature and guidelines [12]. Although the decisions physicians make on a daily basis rest heavily on what the literature has shown, statistical competency among practicing physicians has been largely unassessed.
A sound understanding of statistics and the interpretation of data are crucial in making decisions and predictions based on results presented in the literature. The purpose of this study was to assess the understanding of biostatistics in shoulder and elbow surgeons and current fellows. Specifically, we surveyed their ability to interpret data and identify statistical terminology. We also gathered their subjective attitudes toward statistics and confidence in understanding the various topics.

Methods
We conducted a cross-sectional survey that was administered to members of the American Shoulder and Elbow Surgeons (ASES) via email. All surveys were conducted online in an unmonitored environment. This study was approved by the Institutional Review Board for Human Studies at Orlando Health Medical Centre and the University of Central Florida College of Medicine (ORA#1640952). All data can be accessed via the deidentified Qualtrics data reporting service.

Survey Design.
Participants were asked to complete all four sections of our survey. Sections included participant (1) demographics and education, (2) perception of statistics, (3) confidence in the ability to understand various statistical concepts, and (4) the biostatistics knowledge examination (BKE).
In the first section, we collected data regarding participant gender, age, fellowship subspeciality, years in practice, professional degrees held, training location, number of publications, and involvement with the peer-review process and medical education (Table 1). In section two, we used a Likert scale (strongly disagree, disagree, neutral, agree, and strongly agree) to examine participants' opinions on the value of statistics. Section three evaluated self-reported participant confidence in statistical interpretation. We used a five-point scale with zero symbolising no confidence and five symbolising full confidence.
Lastly, the fourth section included the BKE, which was developed by Windish et al. and has been shown to offer good discriminative ability of statistics knowledge [13]. The examination tests commonly used methods and the understanding of terms encountered in statistics. The BKE consisted of 20 multiple-choice questions presented in a vignette-type fashion. No calculations were required. Windish et al. adopted two questions from a Danish study with a similar focus [6]. Several were generated from the course material at Johns Hopkins Bloomberg School of Public Health [14], and the remainder tested concepts used in publications across six medicine journals with high impact factors (American Journal of Medicine, Annals of Internal Medicine, BMJ, JAMA, Lancet, and New England Journal of Medicine). The specific topics tested are shown in Table 2.

Data Collection and Analysis.
A Qualtrics XM-based survey was administered by email to all members of the ASES and all responses were collected anonymously. Respondents were limited to practicing orthopaedic surgeons and orthopaedic surgery fellows.
An a priori statistical power analysis determined that a sample size of 52 participants was necessary to achieve a power of 0.8 (anticipated effect size = 0.7, probability level = 0.05). The survey was administered on January 5th, 2022, and remained open through February 2022. Participation was optional. Data were recorded and analysis was performed using R Core Team version 4.1.3 (R Foundation for Statistical Computing, Vienna, Austria). Participants received a report following survey completion, which provided a performance score for the BKE and identified correctly and incorrectly answered questions.
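To make the a priori calculation concrete, the following is a minimal sketch of a sample-size computation for comparing two group means. The text does not state which test or sidedness the power analysis assumed, so the two-group design, the interpretation of the effect size as a standardised mean difference, the normal approximation, and the function name `n_per_group` are all assumptions made for illustration.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size: float, alpha: float, power: float, one_sided: bool) -> int:
    """Normal-approximation sample size per group for comparing two means.

    effect_size is treated as a standardised mean difference (an assumption;
    the source reports only "anticipated effect size = 0.7").
    """
    z = NormalDist()
    tail = alpha if one_sided else alpha / 2
    z_alpha = z.inv_cdf(1 - tail)   # critical value for the chosen alpha
    z_beta = z.inv_cdf(power)       # quantile corresponding to the target power
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


# Total sample size across two equal groups under each sidedness assumption.
one_sided_total = 2 * n_per_group(0.7, 0.05, 0.8, one_sided=True)
two_sided_total = 2 * n_per_group(0.7, 0.05, 0.8, one_sided=False)
print(one_sided_total, two_sided_total)  # prints: 52 66
```

Under these assumptions the one-sided calculation gives a total of 52, which coincides with the reported figure, while the two-sided calculation gives 66. This coincidence does not confirm the setup actually used in the study.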
Only fully completed surveys were used in our analysis. Each question was analysed individually across the surveyed cohort. In addition, we assessed whether an association existed between various factors gathered from our other data points and BKE scores using bivariate and multivariate analyses. Variable selection for multivariate analyses was conducted using a forward stepwise selection to determine factors most strongly associated with correct BKE scores on univariate analyses [15]. Those with p values <0.1 were assessed in multivariate cohorts, grouped by demographic variables and confidence/attitude scores. Differences among participant characteristics and BKE performance were analysed using a Student's t-test or a one-way analysis of variance (ANOVA). Participant characteristics were used as independent variables while the mean correct percentage on the BKE was evaluated as the dependent variable. Analyses were corrected for multiple comparisons to minimise the effect of random chance influencing the results. Variables with p values remaining ≤0.05 on all comparisons were deemed to remain significant without influence from random chance.
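The text does not name the multiple-comparison correction that was applied. As one possible illustration, the sketch below implements the Holm-Bonferroni step-down adjustment; the choice of method, the function name `holm_adjust`, and the p values are hypothetical and are not taken from the study.

```python
def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p values in the original order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p values from five comparisons (illustration only).
example_p = [0.004, 0.030, 0.012, 0.040, 0.200]
adjusted = holm_adjust(example_p)
still_significant = [p <= 0.05 for p in adjusted]
print(still_significant)  # prints: [True, False, True, False, False]
```

In this toy example, four of the five raw p values fall below 0.05, but only two remain at or below 0.05 after adjustment, which is the kind of filtering the paragraph above describes.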

Results

Demographics.
Of the 55 surgeons (4.5% response rate from distribution) who completed the survey, 92.7% were male and 80% received their medical training from within the United States. The highest proportion of participants was from 40 to 49 years (30.9%), followed by 30-39 years (27.3%). The majority of respondents reported having been fellowship-trained in shoulder and elbow surgery (52.7%). In addition, most held a Doctor of Medicine (MD) designation (87.3%), and a quarter of these participants reported additional degrees. Over 70% reported publishing at least one manuscript per year, with 16% of all respondents publishing nine or more annually. Similarly, 52 of the participants (94.5%) had at one point been involved in the peer review process for various journals, and greater than 80% have had some degree of involvement with medical education (Table 1).
Cronbach's alpha of 0.78 demonstrated a good internal consistency of the BKE within the study population. This value indicates that participants who score well on most items are likely to continue to do so on the remaining items, as the test questions have high internal consistency.
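For readers unfamiliar with how Cronbach's alpha is obtained, the sketch below computes it from a matrix of item scores using the standard formula alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The item-level responses from the survey are not reported, so the toy data and the function name `cronbach_alpha` are hypothetical.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Assumes every respondent answered the same k >= 2 items and that the
    total scores are not all identical (otherwise the variance is zero).
    """
    k = len(scores[0])
    # Variance of each item (column) across respondents.
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    # Variance of each respondent's total score.
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical 0/1 (incorrect/correct) results: 5 respondents, 4 items.
toy_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(toy_scores)
print(round(alpha, 2))  # prints: 0.8
```

The same ratio results whether population or sample variances are used, provided the choice is consistent for the items and the totals.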

Factors Associated with Statistical Knowledge.
Younger participants scored significantly higher on the BKE compared to older examinees across all analysed age cohorts (p < 0.001). Additional statistically significant differences in BKE performance were identified when compared against time since medical school graduation and ABOS examination completion. Surgeons with more scholarly activity (>9 publications annually) were also found to have significantly higher scores on the BKE (p = 0.009) (Table 1). Comparing scores by gender did not reveal a significant difference on the BKE. However, through logistic regression analysis, female gender, early career surgeons, and scholarly activity were predictors of higher BKE scores (p < 0.001, p < 0.001, and p = 0.009, respectively). The proportion of explained variance for the models was large, with R² = 0.60, indicating a high amount of score variance due to analysed factors.

Attitudes and Self-Reported Confidence.
Participants overall shared a positive opinion on the value of statistics, with 98.2% agreeing that statistical competency is important and 80% favouring continued education in statistics. Over 90% of respondents reported that statistics help guide medical decision making in their practice. Attitudes toward statistics were not significantly associated with higher BKE scores (Table 3). Participants who reported high confidence in interpreting statistical results (p = 0.009) and assessing the correct statistical test (p = 0.017) demonstrated significantly higher BKE scores than those who did not. In contrast, participants who claimed high confidence in their ability to identify factors influencing the power of a study had significantly lower BKE scores (p = 0.011) (Table 3).

Discussion
4.1. Background and Rationale.
The present study employed a cross-sectional survey to assess the understanding and confidence members of the American Shoulder and Elbow Surgeons have in biostatistics. Reports have surfaced suggesting that care providers often misinterpret statistical methods and outcomes, which calls into question their ability to make sound evidence-based decisions [16]. It is thus timely and important to evaluate how fellowship-trained orthopaedic surgeons perceive their aptitudes for understanding biostatistics in the literature and how they perform when administered a BKE. Our study population largely consisted of academic orthopaedic shoulder and elbow surgeons, many of whom publish frequently (>70% reported publishing at least one manuscript per year and 16% publish 9 or more per year) or take part in reviewing scientific research for publication, which includes critiquing statistical analyses. It was concerning that this subset of surgeons scored 55.2% correctly on average using the BKE, as potentially flawed research may be distributed to other academic or community surgeons with errant conclusions often taken at face value. One could infer that if our study population scored only 55.2% correctly, nonacademic and community surgeons would likely score lower. When interpreting data to inform their practice, these surgeons may rely on studies' conclusions if they are unfamiliar with the statistical methodology. Furthermore, the recent surge in publication volume and complexity of statistical analyses only compounds the aforementioned problems. When appropriate, the use of basic statistical methods and a thorough description of complex methods may improve reader understanding of study data. While continued medical education could benefit surgeons' understanding of biostatistics, it is unclear whether this would be sufficient with the evolving statistical complexity in studies or if the average orthopaedist would be interested. However, targeting journal reviewers with screening, testing, or other additional qualifications prior to commenting on a study's methodology may improve the clarity and correctness of statistical interpretation in orthopaedic literature.

4.2. Limitations.
This study had several limitations, most of which were due to the survey format. Although many important concepts were covered in the BKE, the examination was limited to 20 questions and used topics commonly represented in medicine journals rather than orthopaedic surgery journals (e.g., Kaplan-Meier analysis is rarely utilised in orthopaedic studies and only 12.7% of our cohort answered this topic correctly). The limited volume of questions suggests that the current BKE may not accurately represent the true overall knowledge that our participants have in regard to biostatistics. The study population was limited to only ASES members, which included practicing surgeons and fellows in ASES-accredited programs; this limitation may have skewed results, which may not be generalisable to the general orthopaedic surgery physician population. In addition, the present study found that women performed better than men on the BKE. This differs from results reported by Windish et al. [13], which found no difference. Further research is needed to understand if this is unique to the female members of the ASES or a generalisable finding. Volunteer bias may have also played a role for both the female and male participants, who may not be representative of the entire ASES membership. The current questionnaire achieved only a 4.5% survey response rate, which may introduce significant selection bias and reduce external validity. This selection bias may have also led to results that were not reflective of the majority of orthopaedic surgeons. The prior Windish et al. study using the BKE was administered in-person during residency noon conferences to assure attendance, with rates over 70% [13]. Due to the lack of incentive to complete the study questionnaire in our cohort and online administration, fewer responses were expected to be collected [1,2]. Data errors resulting from omitted questions may skew final results; however, we did manage to achieve sufficient overall power with a sample size of 55. Lastly, this test was administered online for remote completion. This may have allowed participants to utilise outside resources to assist with answering questions and may have led to overestimated BKE scores.

Biostatistics Knowledge Examination.
Naturally, some statistics concepts are more easily comprehended than others. Participants were better able to interpret relative risk and recognise a double-blind study than interpret Kaplan-Meier analysis results or a 95% confidence interval and the corresponding statistical significance. Only 27.3% of respondents could identify a Chi-squared test, which is essential for many orthopaedic studies, and 9.1% of respondents correctly interpreted 95% confidence intervals and statistical significance. This result is similar to that which Windish et al. reported [13], where 25.6% and 11.9% of surveyed internal medicine residents answered these questions correctly for Chi-squared and confidence intervals, respectively. In addition, the respondents in their study similarly performed best when asked to interpret relative risk and recognise a double-blind study. The above finding may come as a surprise considering that Chi-squared tests, confidence intervals, and claims for statistical significance are commonly encountered concepts in the literature. However, misinterpretation of confidence intervals has been previously reported [17]. The failure to accurately interpret these results in both our current analysis and that of Windish et al. may be attributed to confusing verbiage used in the questions and answers themselves. The BKE used three questions to test participants' knowledge of these principles. Answers were non-numerical and required participants to select an answer from a list of distractors that were similarly worded, which required examinees to know the precise definition for the tested concepts. However, one could argue that testing the nuances and specifics of these concepts is critical to understanding and applying them to studies and that physician misinterpretation is a failure to understand these tested concepts. While these findings may raise concern, there is no evidence looking at the consequence that misinterpretation could have on a physician's ability to apply research findings safely and effectively into his or her own practice. Over 90% of our respondents reported using statistics to guide medical decision-making, though less than 10% tested correctly on interpreting confidence intervals and statistical significance. Furthermore, we reported that participants with higher confidence in their ability to interpret study power were associated with lower BKE scores. This false confidence could lead surgeons to errant conclusions and medical decision-making. Further research will be needed to determine if a novel question format could more accurately assess the ability to understand and interpret confidence intervals and claims of statistical significance.

Factors Associated with Statistical Knowledge.
In 2020, applicants who successfully matched into a United States orthopaedic surgery residency had an average of 14.3 publications, presentations, and posters, highlighting the early engagement orthopaedic surgeons have with research participation and interpretation [13]. However, Ngaage et al. found that in successfully matched orthopaedic surgery residents, the median number of publications was 1 and that 40% did not hold any publications, demonstrating a dichotomy between works reported and actually completed [18]. This finding may undermine the value of earlier exposure and participation in research of orthopaedic surgeons as there is a push for increased academic productivity at the cost of reduced project responsibility and a shift toward faster published articles.
The respondents in the current study were all fellowship-trained or currently in fellowship, which may offer additional opportunities for biostatistics education. Our study participants' large academic involvement may have resulted in higher scores than the average orthopaedic surgeon. Windish et al. [13] also discovered that BKE performance trended downward as individuals were further out from their formal medical education. This may suggest an opportunity for improvement in the continuing medical education structure. This has been addressed in recent years; medical education leaders have taken strides to incorporate biostatistics as a part of the formal curriculum for orthopaedic trainees [11,19]. Further research will be needed to understand the individual effect of medical education on the understanding and interpretation of biostatistics for practicing surgeons.

Attitudes and Self-Reported Confidence.
Our study demonstrated a consensus, with 98% of respondents agreeing that statistical competency is important. However, BKE performance was poor overall, with respondents averaging only 11 correct answers out of 20 questions. These findings indicate that future work should investigate methods to continue statistical education among orthopaedic surgeons.

Conclusions
This study assessed the biostatistical knowledge of fellowship-trained shoulder and elbow surgeons, many of whom frequently publish or are reviewing scientific literature for publication. This population scored an average of 55.2% correctly, raising concern that some of the most research-literate academic shoulder and elbow surgeons lack basic statistical understanding. This may imply that potentially flawed research is being distributed to other academic or community surgeons with subsequent errant conclusions. Future directions to improve research reliability and reader understanding include thorough descriptions of the research methods used in studies and their limitations, as well as utilisation of basic statistical methods when appropriate. While continuing medical education may also benefit orthopaedic surgeons, it is unclear if this would be sufficient or if the average orthopaedic surgeon would be interested. Targeting orthopaedic journal reviewers with screening or other additional qualifications prior to commenting on statistical methodology could improve the clarity of results published and distributed to orthopaedic surgeons at large.
Our study demonstrates that younger surgeons, female surgeons, and those with a greater number of publications per year scored higher on the BKE. Improved scores for younger respondents and those closer to their time in training may be due to familiarity with biostatistics and enhanced emphasis of statistical education in modern medical school, residency, and fellowship curricula. Further research is needed to understand the effect of gender, medical education, phase of career, speciality, and subspeciality on physicians' level of understanding of biostatistics.
Nearly all respondents felt that statistics are important; this current study highlights that further work is needed to educate surgeons on how to interpret biostatistics. Improving our ability to work with statistics will allow surgeon researchers to continue driving the field of orthopaedic surgery towards better and safer treatments.

Table 1: Demographic data and correlation with biostatistical knowledge exam scores.
Advances in Orthopedics

3.2. Biostatistics Knowledge Examination.
Overall, 55.2% of questions on the BKE were answered correctly. Participants performed best when asked to interpret relative risk (96.4% correct) and in recognising a double-blind study (92.7% correct). The two areas where participants performed the weakest were when interpreting a 95% confidence interval and statistical significance (9.1%) and when asked to interpret Kaplan-Meier analysis results (12.7%) (Table 2).

Table 2: Percentage of correct answers for biostatistical knowledge examination.

Table 3: Attitudes and confidence of statistical competency.
Bold values represent statistically significant results.